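The ability to discover useful behaviours from past experience and transfer them to new tasks is considered a core component of natural embodied intelligence. Inspired by neuroscience, discovering behaviours that switch at bottleneck states has long been sought after for inducing plans of minimum description length across tasks. Prior approaches have either supported only online, on-policy bottleneck state discovery, limiting sample efficiency, or only discrete state-action domains, restricting applicability. To address this, we introduce Model-Based Offline Options (MO2), an offline hindsight framework supporting sample-efficient bottleneck option discovery over continuous state-action spaces. Once bottleneck options are learnt offline over source domains, they are transferred online to improve exploration and value estimation on the transfer domain. Our experiments show that on complex long-horizon continuous control tasks with sparse, delayed rewards, MO2's properties are essential and lead to performance exceeding recent option learning methods. Additional ablations further demonstrate the impact on option predictability and credit assignment.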
translated by 谷歌翻译
With an increasing amount of data in the art world, discovering artists and artworks suitable to collectors' tastes becomes a challenge. It is no longer enough to use visual information, as contextual information about the artist has become just as important in contemporary art. In this work, we present a generic Natural Language Processing framework (called ArtLM) to discover the connections among contemporary artists based on their biographies. In this approach, we first continue to pre-train the existing general English language models with a large amount of unlabelled art-related data. We then fine-tune this new pre-trained model with our biography pair dataset manually annotated by a team of professionals in the art industry. With extensive experiments, we demonstrate that our ArtLM achieves 85.6% accuracy and 84.0% F1 score and outperforms other baseline models. We also provide a visualisation and a qualitative analysis of the artist network built from ArtLM's outputs.
We identify the task of measuring data to quantitatively characterize the composition of machine learning data and datasets. Similar to an object's height, width, and volume, data measurements quantify different attributes of data along common dimensions that support comparison. Several lines of research have proposed what we refer to as measurements, with differing terminology; we bring some of this work together, particularly in fields of computer vision and language, and build from it to motivate measuring data as a critical component of responsible AI development. Measuring data aids in systematically building and analyzing machine learning (ML) data towards specific goals and gaining better control of what modern ML systems will learn. We conclude with a discussion of the many avenues of future work, the limitations of data measurements, and how to leverage these measurement approaches in research and practice.
Multi-agent artificial intelligence research promises a path to develop intelligent technologies that are more human-like and more human-compatible than those produced by "solipsistic" approaches, which do not consider interactions between agents. Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios. Each scenario pairs a physical environment (a "substrate") with a reference set of co-players (a "background population"), to create a social situation with substantial interdependence between the individuals involved. For instance, some scenarios were inspired by institutional-economics-based accounts of natural resource management and public-good-provision dilemmas. Others were inspired by considerations from evolutionary biology, game theory, and artificial life. Melting Pot aims to cover a maximally diverse set of interdependencies and incentives. It includes the commonly studied extreme cases of perfectly competitive (zero-sum) motivations and perfectly cooperative (shared-reward) motivations, but does not stop with them. As in real life, a clear majority of scenarios in Melting Pot have mixed incentives. They are neither purely competitive nor purely cooperative and thus demand that successful agents be able to navigate the resulting ambiguity. Here we describe Melting Pot 2.0, which revises and expands on Melting Pot. We also introduce support for scenarios with asymmetric roles, and explain how to integrate them into the evaluation protocol. This report also contains: (1) details of all substrates and scenarios; (2) a complete description of all baseline algorithms and results. Our intention is for it to serve as a reference for researchers using Melting Pot 2.0.
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Progress in machine learning (ML) comes with a cost to the environment, given that training ML models requires significant computational resources, energy and materials. In the present article, we aim to quantify the carbon footprint of BLOOM, a 176-billion parameter language model, across its life cycle. We estimate that BLOOM's final training emitted approximately 24.7 tonnes of CO2eq if we consider only the dynamic power consumption, and 50.5 tonnes if we account for all processes ranging from equipment manufacturing to energy-based operational consumption. We also study the energy requirements and carbon emissions of its deployment for inference via an API endpoint receiving user queries in real-time. We conclude with a discussion regarding the difficulty of precisely estimating the carbon footprint of ML models and future research directions that can contribute towards improving carbon emissions reporting.
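ImageNet-1k is a dataset often used for benchmarking machine learning (ML) models and evaluating tasks such as image recognition and object detection. Wild animals make up 27% of ImageNet-1k but, unlike classes representing people and objects, these data have not been closely scrutinized. In the current paper, we analyze the 13,450 images from 269 classes that represent wild animals in the ImageNet-1k validation set, with the participation of expert ecologists. We find that many of the classes are ill-defined or overlapping, and that 12% of the images are incorrectly labeled, with some classes having more than 90% of their images incorrect. We also find that both the wildlife-related labels and the images included in ImageNet-1k present significant geographical and cultural biases, as well as ambiguities such as artificial animals, multiple species in the same image, or the presence of humans. Our findings highlight serious issues with the extensive use of this dataset for evaluating ML systems, with the use of such algorithms in wildlife-related tasks, and more broadly with the ways in which ML datasets are commonly created and curated.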
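The absence of large-scale labeled data can be a bottleneck to applying machine learning algorithms in practice. Transfer learning is a popular strategy for leveraging additional data to improve downstream performance, but finding the most relevant data to transfer from can be challenging. Neural Data Server (NDS), a search engine that recommends relevant data for a given downstream task, has previously been proposed to address this problem. NDS uses a mixture of experts trained on the data sources to estimate the similarity between each source and the downstream task. Thus, the computational cost to each user grows with the number of sources. To address these issues, we propose Scalable Neural Data Server (SNDS), a large-scale search engine that can theoretically index thousands of datasets to serve relevant ML data to end users. SNDS trains the mixture of experts on intermediary datasets during initialization, and represents both data sources and downstream tasks by their proximity to the intermediary datasets. As such, the computational cost incurred by SNDS users remains fixed as new datasets are added to the server. We validate SNDS on many real-world tasks and find that data recommended by SNDS improves downstream task performance over baselines. We also demonstrate the scalability of SNDS by showing its ability to select relevant data for transfer outside of the natural image setting.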
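
The indexing idea in the abstract above can be illustrated with a minimal sketch. It assumes that each data source and the downstream task have already been summarised as a vector of proximity scores, one score per intermediary dataset. The names `cosine` and `rank_sources`, the example scores, and the choice of cosine similarity for comparing vectors are all assumptions made for illustration; the abstract does not specify how proximity is computed or compared.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_sources(task_vec, source_vecs):
    """Return source names ordered from most to least similar to the task.

    task_vec:    proximity of the downstream task to each intermediary dataset
    source_vecs: dict mapping a source name to its proximity vector
    (Both representations are hypothetical stand-ins for illustration.)
    """
    return sorted(
        source_vecs,
        key=lambda name: cosine(task_vec, source_vecs[name]),
        reverse=True,
    )


# Hypothetical proximity scores against three intermediary datasets.
sources = {
    "source_a": [1.0, 0.0, 0.0],
    "source_b": [0.0, 1.0, 0.0],
    "source_c": [1.0, 1.0, 0.0],
}
task = [1.0, 0.9, 0.0]
ranking = rank_sources(task, sources)
print(ranking)
```

In this sketch the task vector has one entry per intermediary dataset, so the work needed to build it does not depend on how many sources the server indexes. This mirrors the fixed user-side cost described in the abstract.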
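By providing unprecedented access to computational resources, cloud computing has enabled rapid growth in technologies such as machine learning, the computational demands of which incur a high energy cost and a commensurate carbon footprint. As a result, recent scholarship has called for better estimates of the greenhouse gas impact of AI: data scientists today do not have easy or reliable access to measurements of this information, which precludes the development of actionable tactics. Cloud providers presenting information about software carbon intensity to users is a fundamental stepping stone towards minimizing emissions. In this paper, we provide a framework for measuring software carbon intensity, and propose to measure operational carbon emissions by using location-based and time-specific marginal emissions data per energy unit. We provide measurements of operational software carbon intensity for a set of modern models for natural language processing and computer vision, across a wide range of model sizes, including the pretraining of a 6.1 billion parameter language model. We then evaluate a suite of approaches for reducing emissions on the Microsoft Azure cloud compute platform: using cloud instances in different geographic regions, using cloud instances at different times of day, and dynamically pausing cloud instances when the marginal carbon intensity is above a certain threshold. We confirm previous results that the geographic region of the data center plays a significant role in the carbon intensity of a given cloud instance, and find that choosing an appropriate region can have the largest impact on reducing operational emissions. We also show that the time of day has a notable impact on operational software carbon intensity. Finally, we conclude with recommendations for how machine learning practitioners can use software carbon intensity information to reduce their environmental impact.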
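
The measurement and pausing ideas in the abstract above can be sketched as follows. Operational emissions are taken here as energy used multiplied by the marginal emissions per energy unit for the relevant location and time, and the pausing strategy runs the job only while the marginal intensity is at or below a threshold. The function names, the fixed energy per time step, and all numeric values are hypothetical and chosen for illustration; they are not taken from the paper.

```python
def operational_emissions(energy_kwh_per_step, intensities_g_per_kwh):
    """Emissions in gCO2eq when the job runs in every listed time step.

    energy_kwh_per_step:   energy drawn per time step (assumed constant)
    intensities_g_per_kwh: marginal grid intensity for each time step
    """
    return sum(energy_kwh_per_step * g for g in intensities_g_per_kwh)


def emissions_with_pausing(energy_kwh_per_step, steps_needed,
                           intensities_g_per_kwh, threshold):
    """Emissions when the job pauses whenever intensity exceeds threshold.

    The job needs `steps_needed` steps of work and only runs in steps whose
    marginal intensity is at or below `threshold`, so it may finish later.
    """
    emitted = 0.0
    done = 0
    for g in intensities_g_per_kwh:
        if done == steps_needed:
            break
        if g <= threshold:
            emitted += energy_kwh_per_step * g
            done += 1
    if done < steps_needed:
        raise ValueError("not enough low-intensity steps to finish the job")
    return emitted


# Hypothetical marginal intensities (gCO2eq per kWh) for five time steps.
forecast = [400.0, 600.0, 300.0, 700.0, 350.0]

# Run three steps back to back versus pausing above 450 gCO2eq per kWh.
baseline = operational_emissions(2.0, forecast[:3])
paused = emissions_with_pausing(2.0, 3, forecast, threshold=450.0)
print(baseline, paused)
```

In this toy setting pausing lowers emissions at the cost of a later finish time, which is the trade-off a practitioner would weigh when choosing a threshold.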
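Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated Transformer language model, which we call the Pathways Language Model (PaLM). We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system that enables highly efficient training across multiple TPU Pods. We demonstrate the continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough performance, surpassing the prior state of the art on a suite of multi-step reasoning tasks and outperforming average human performance on the recently released BIG-bench benchmark. A significant number of BIG-bench tasks showed discontinuous improvements with model scale, meaning that performance increased steeply as we scaled to our largest model. PaLM also has strong capabilities in multilingual tasks and source code generation, which we demonstrate on a wide array of benchmarks. We additionally provide a comprehensive analysis of bias and toxicity, and study the extent of training data memorization with respect to model scale. Finally, we discuss the ethical considerations related to large language models and discuss potential mitigation strategies.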